construction task
Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning
White, Isadora, Nottingham, Kolby, Maniar, Ayush, Robinson, Max, Lillemark, Hansen, Maheshwari, Mehul, Qin, Lianhui, Ammanabrolu, Prithviraj
Collaboration is ubiquitous and essential in day-to-day life -- from exchanging ideas, to delegating tasks, to generating plans together. This work studies how LLMs can adaptively collaborate to perform complex embodied reasoning tasks. To this end we introduce MINDcraft, an easily extensible platform built to enable LLM agents to control characters in the open-world game of Minecraft; and MineCollab, a benchmark to test the different dimensions of embodied and collaborative reasoning. An experimental study finds that the primary bottleneck in collaborating effectively for current state-of-the-art agents is efficient natural language communication, with agent performance dropping as much as 15% when they are required to communicate detailed task completion plans. We conclude that existing LLM agents are ill-optimized for multi-agent collaboration, especially in embodied scenarios, and highlight the need to employ methods beyond in-context and imitation learning. Our website can be found here: https://mindcraft-minecollab.github.io/
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- North America > Dominican Republic (0.04)
- Asia > China > Hong Kong (0.04)
- Research Report > New Finding (0.34)
- Research Report > Experimental Study (0.34)
APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents
We present APT, an advanced Large Language Model (LLM)-driven framework that enables autonomous agents to construct complex and creative structures within the Minecraft environment. Unlike previous approaches that primarily concentrate on skill-based open-world tasks or rely on image-based diffusion models for generating voxel-based structures, our method leverages the intrinsic spatial reasoning capabilities of LLMs. By employing chain-of-thought decomposition along with multimodal inputs, the framework generates detailed architectural layouts and blueprints that the agent can execute under zero-shot or few-shot learning scenarios. Our agent incorporates both memory and reflection modules to facilitate lifelong learning, adaptive refinement, and error correction throughout the building process. To rigorously evaluate the agent's performance in this emerging research area, we introduce a comprehensive benchmark consisting of diverse construction tasks designed to test creativity, spatial reasoning, adherence to in-game rules, and the effective integration of multimodal instructions. Experimental results using various GPT-based LLM backends and agent configurations demonstrate the agent's capacity to accurately interpret extensive instructions involving numerous items, their positions, and orientations. The agent successfully produces complex structures complete with internal functionalities such as Redstone-powered systems. A/B testing indicates that the inclusion of a memory module leads to a significant increase in performance, emphasizing its role in enabling continuous learning and the reuse of accumulated experience. Additionally, the unexpected emergence of scaffolding behavior in the agent highlights the potential of future LLM-driven agents to use subroutine planning and to leverage the emergent abilities of LLMs to autonomously develop human-like problem-solving techniques.
- Education > Educational Setting > Continuing Education (0.54)
- Leisure & Entertainment > Games > Computer Games (0.37)
MineLand: Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs
Yu, Xianhao, Fu, Jiaqi, Deng, Renjia, Han, Wenjuan
While Vision-Language Models (VLMs) hold promise for tasks requiring extensive collaboration, traditional multi-agent simulators have facilitated rich explorations of an interactive artificial society that reflects collective behavior. However, these existing simulators face significant limitations. Firstly, they struggle with handling large numbers of agents due to high resource demands. Secondly, they often assume agents possess perfect information and limitless capabilities, hindering the ecological validity of simulated social interactions. To bridge this gap, we propose MineLand, a multi-agent Minecraft simulator that introduces three key features: large-scale scalability, limited multimodal senses, and physical needs. Our simulator supports 64 or more agents. Agents have limited visual, auditory, and environmental awareness, forcing them to actively communicate and collaborate to fulfill physical needs like food and resources. Additionally, we introduce an AI agent framework, Alex, inspired by multitasking theory, enabling agents to handle intricate coordination and scheduling. Our experiments demonstrate that the simulator, the corresponding benchmark, and the AI agent framework contribute to more ecological and nuanced collective behavior. The source code of MineLand and Alex is openly available at https://github.com/cocacola-lab/MineLand.
- Europe > Sweden > Skåne County > Malmö (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Florida > Orange County > Orlando (0.04)
Integration of 4D BIM and Robot Task Planning: Creation and Flow of Construction-Related Information for Action-Level Simulation of Indoor Wall Frame Installation
Oyediran, Hafiz, Turner, William, Kim, Kyungki, Barrows, Matthew
An obstacle toward construction robotization is the lack of methods to plan robot operations within the entire construction planning process. Despite the strength in modeling construction site conditions, 4D BIM technologies cannot perform construction robot task planning considering the contexts of given work environments. To address this limitation, this study presents a framework that integrates 4D BIM and robot task planning, presents an information flow for the integration, and performs high-level robot task planning and detailed simulation. The framework uniquely incorporates a construction robot knowledge base that derives robot-related modeling requirements to augment a 4D BIM model. Then, the 4D BIM model is converted into a robot simulation world where a robot performs a sequence of actions retrieving construction-related information. A case study focusing on the interior wall frame installation demonstrates the potential of systematic integration in achieving context-aware robot task planning and simulation in construction environments.
Simulated a mobile robot's actions to install wall frames in a residential building
1. Introduction
Rapid advancements in robotics technologies are making the utilization of robots for dangerous, tedious, and repetitive tasks more and more practical [1]. Unlike traditional industrial robots with fixed behaviors, modern robots with mobile platforms, sensors, and actuators can be programmed to perform given tasks intelligently adapting to changing work environments. Many sectors, including manufacturing [2], rescue [3], agriculture [4], and healthcare [5], are adopting robots to automate existing processes to achieve greater productivity and safety. Many construction tasks are repetitive and labor-intensive by nature [7,8], and thus robotization of these tasks can potentially address many chronic problems, such as stagnant productivity growth [9], labor shortage [10], and work-related diseases/fatalities [11].
A growing number of robotic solutions are being introduced by academic studies [12,13] and industrial applications (excavation and leveling [14], marking of layout [15], rebar tying [16], and bricklaying [17,18]). With this trend, construction sites are expected to become crowded with robots and human workers in the near future, exposing human workers to robot-related hazards, such as collisions, crushing, trapping, mechanical part accidents, etc. [19]. In order to utilize robots safely and effectively in congested construction environments, both high-level task planning and detailed simulation of construction robots should be performed as part of the entire construction planning. Despite the abundant studies on the coordination between human work crews [20,21], none of the prior studies incorporated robot operations into the construction planning process.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Nebraska > Lancaster County > Lincoln (0.04)
- North America > United States > Nebraska > Douglas County > Omaha (0.04)
- Health & Medicine (1.00)
- Construction & Engineering (1.00)
Enabling BIM-Driven Robotic Construction Workflows with Closed-Loop Digital Twins
Wang, Xi, Yu, Hongrui, McGee, Wes, Menassa, Carol C., Kamat, Vineet R.
The introduction of assistive construction robots can significantly alleviate physical demands on construction workers. Leveraging a Building Information Model (BIM) offers a natural and promising approach to driving a robotic construction workflow. However, because of uncertainties inherent in construction sites, such as discrepancies between the as-designed and as-built components, robots cannot solely rely on a BIM to plan and perform field construction work. Human workers are adept at improvising alternative plans with their creativity and experience and thus can assist robots in overcoming uncertainties and performing construction work successfully. In such scenarios, it is critical to continuously update the BIM as work processes unfold so that it includes as-built information for the ensuing construction and maintenance tasks. This research introduces an interactive closed-loop digital twin framework that integrates a BIM into human-robot collaborative construction workflows. The robot's functions are primarily driven by the BIM, but it adaptively adjusts its plans based on actual site conditions, while the human co-worker oversees and supervises the process. When necessary, the human co-worker intervenes in the robot's plan by changing the task sequence or workspace geometry or requesting a new motion plan to help the robot overcome the encountered uncertainties. Experiments involving block pick-and-place tasks are carried out to verify system performance using an industrial robotic arm in a research laboratory setting that mimics a construction site. In addition, a drywall installation case study is conducted to validate the system. Integrating the flexibility of human workers and the autonomy and accuracy afforded by BIMs, the proposed framework offers significant promise of increasing the robustness of construction robots in the performance of field construction work.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Virginia > Fairfax County > Reston (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- Workflow (1.00)
- Research Report (0.84)
Habits of Mind: Reusing Action Sequences for Efficient Planning
When we exercise sequences of actions, their execution becomes more fluent and precise. Here, we consider the possibility that exercised action sequences can also be used to make planning faster and more accurate by focusing expansion of the search tree on paths that have been frequently used in the past, and by reducing deep planning problems to shallow ones via multi-step jumps in the tree. To capture such sequences, we use a flexible Bayesian action chunking mechanism which finds and exploits statistically reliable structure at different scales. This gives rise to shorter or longer routines that can be embedded into a Monte-Carlo tree search planner. We show the benefits of this scheme using a physical construction task patterned after tangrams.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Asia > Myanmar > Andaman Sea (0.04)
- Workflow (0.73)
- Research Report (0.51)
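The idea in the "Habits of Mind" abstract above, reusing frequently exercised action sequences so that a deep search becomes a shallow one, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: simple n-gram counting stands in for the paper's Bayesian chunking mechanism, breadth-first search stands in for Monte-Carlo tree search, and the grid domain and the names `mine_chunks`, `plan`, and `step` are hypothetical.

```python
from collections import Counter, deque

def mine_chunks(histories, length=2, min_count=2):
    """Return action n-grams of the given length seen at least min_count times.

    Plain frequency counting; a stand-in for Bayesian chunking (assumption).
    """
    counts = Counter()
    for seq in histories:
        for i in range(len(seq) - length + 1):
            counts[tuple(seq[i:i + length])] += 1
    return [chunk for chunk, n in counts.items() if n >= min_count]

def plan(start, goal, step, primitives, chunks=(), max_depth=3):
    """Breadth-first search in which each decision is one primitive or one chunk.

    Depth counts decisions, not primitive steps, so a chunk is a multi-step jump.
    """
    moves = [(a,) for a in primitives] + [tuple(c) for c in chunks]
    frontier = deque([(start, ())])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            # Flatten the chosen moves back into primitive actions.
            return [a for move in path for a in move]
        if len(path) == max_depth:
            continue
        for move in moves:
            nxt = state
            for a in move:
                nxt = step(nxt, a)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + (move,)))
    return None

# Hypothetical toy domain: a grid where "R" moves right and "U" moves up.
def step(state, action):
    x, y = state
    return (x + 1, y) if action == "R" else (x, y + 1)

histories = [["R", "R", "U"], ["R", "R", "U", "R", "R"]]
chunks = mine_chunks(histories)

# The goal needs five primitive steps, but the search depth is capped at three.
without_chunks = plan((0, 0), (4, 1), step, ["R", "U"], max_depth=3)
with_chunks = plan((0, 0), (4, 1), step, ["R", "U"], chunks, max_depth=3)
```

With the depth cap at three decisions, the primitive-only search cannot reach a goal five steps away, while the search that may also pick a mined two-step chunk can.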
MARC: A multi-agent robots control framework for enhancing reinforcement learning in construction tasks
Duan, Kangkang, Suen, Christine Wun Ki, Zou, Zhengbo
Letting robots emulate human behavior has always posed a challenge, particularly in scenarios involving multiple robots. In this paper, we present a framework aimed at achieving multi-agent reinforcement learning for robot control in construction tasks. The construction industry often necessitates complex interactions and coordination among multiple robots, demanding a solution that enables effective collaboration and efficient task execution. Our proposed framework leverages the principles of proximal policy optimization and develops a multi-agent version of it to enable the robots to acquire sophisticated control policies. We evaluate the effectiveness of our framework by learning four different collaborative tasks in construction environments. The results demonstrate the capability of our approach in enabling multiple robots to learn and adapt their behaviors in complex construction tasks while effectively preventing collisions. Results also reveal the potential of combining and exploring the advantages of reinforcement learning algorithms and inverse kinematics. The findings from this research contribute to the advancement of multi-agent reinforcement learning in the domain of construction robotics. By enabling robots to behave like human counterparts and collaborate effectively, we pave the way for more efficient, flexible, and intelligent construction processes.
- Construction & Engineering (1.00)
- Materials > Chemicals > Industrial Gases > Liquified Gas (0.46)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.46)
- Energy > Oil & Gas > Midstream (0.46)
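For readers unfamiliar with proximal policy optimization, which the MARC abstract above builds on, the standard single-agent clipped surrogate objective can be sketched as follows. This is only the textbook PPO objective; the abstract does not describe how the authors extend it to multiple agents, so nothing here should be read as their method. The function name and the sample numbers are illustrative assumptions.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate objective of standard PPO.

    ratios:     new-policy / old-policy probability of each sampled action
    advantages: advantage estimate for each sampled action
    eps:        clip range (0.2 is a common default, assumed here)
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * adv, clipped * adv)
    return total / len(ratios)

# Illustrative values: one action became more likely, one less likely.
objective = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
```

The clipping removes the incentive to move the probability ratio outside the range [1 - eps, 1 + eps] in a single update, which is what keeps policy updates conservative.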
Learning from demonstrations: An intuitive VR environment for imitation learning of construction robots
Construction robots are challenging the traditional paradigm of labor-intensive and repetitive construction tasks. Present concerns regarding construction robots are focused on their abilities in performing complex tasks consisting of several subtasks and their adaptability to work in unstructured and dynamic construction environments. Imitation learning (IL) has shown advantages in training a robot to imitate expert actions in complex tasks, and the policy thereafter generated by reinforcement learning (RL) is more adaptive in comparison with pre-programmed robots. In this paper, we propose a framework composed of two modules for imitation learning of construction robots. The first module provides an intuitive expert demonstration collection Virtual Reality (VR) platform where a robot will automatically follow the position, rotation, and actions of the expert's hand in real-time, instead of requiring an expert to control the robot via controllers. The second module provides a template for imitation learning using observations and actions recorded in the first module. In the second module, Behavior Cloning (BC) is utilized for pre-training, and Generative Adversarial Imitation Learning (GAIL) and Proximal Policy Optimization (PPO) are combined to achieve a trade-off between the strength of imitation vs. exploration. Results show that imitation learning, especially when combined with PPO, could significantly accelerate training in limited training steps and improve policy performance.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
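The trade-off between imitation and exploration mentioned in the VR imitation learning abstract above is often realized by blending a GAIL-style imitation reward with the environment reward before the PPO update. The sketch below shows that general pattern only. The abstract does not state how the authors combine the two signals, so the linear blend, the `imitation_weight` parameter, and both function names are assumptions; the `-log(1 - D)` form is one commonly used GAIL reward, with `D` taken as the discriminator's probability that a state-action pair came from the expert.

```python
import math

def gail_reward(d_expert_prob, tiny=1e-8):
    """Imitation reward -log(1 - D), larger when the sample looks expert-like.

    d_expert_prob: discriminator probability that the sample is from the expert.
    tiny:          floor that avoids log(0) when D approaches 1.
    """
    return -math.log(max(1.0 - d_expert_prob, tiny))

def blended_reward(env_reward, d_expert_prob, imitation_weight=0.5):
    """Linear blend of environment and imitation reward (assumed scheme).

    imitation_weight = 0 gives pure task reward (exploration driven),
    imitation_weight = 1 gives pure imitation reward.
    """
    w = imitation_weight
    return (1.0 - w) * env_reward + w * gail_reward(d_expert_prob)

# Illustrative values: the same transition under three weightings.
pure_task = blended_reward(1.0, 0.5, imitation_weight=0.0)
pure_imitation = blended_reward(1.0, 0.5, imitation_weight=1.0)
balanced = blended_reward(1.0, 0.5, imitation_weight=0.5)
```

Raising the weight pushes the policy toward the demonstrations, while lowering it lets the task reward drive exploration.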
Robot-Enabled Construction Assembly with Automated Sequence Planning based on ChatGPT: RoboGPT
You, Hengxu, Ye, Yang, Zhou, Tianyu, Zhu, Qi, Du, Jing
Robot-based assembly in construction has emerged as a promising solution to address numerous challenges such as increasing costs, labor shortages, and the demand for safe and efficient construction processes. One of the main obstacles in realizing the full potential of these robotic systems is the need for effective and efficient sequence planning for construction tasks. Current approaches, including mathematical and heuristic techniques or machine learning methods, face limitations in their adaptability and scalability to dynamic construction environments. To expand the ability of the current robot system in sequential understanding, this paper introduces RoboGPT, a novel system that leverages the advanced reasoning capabilities of ChatGPT, a large language model, for automated sequence planning in robot-based assembly applied to construction tasks. The proposed system adapts ChatGPT for construction sequence planning and demonstrates its feasibility and effectiveness through experimental evaluation including two case studies and 80 trials on real construction tasks. The results show that RoboGPT-driven robots can handle complex construction operations and adapt to changes on the fly. This paper contributes to the ongoing efforts to enhance the capabilities and performance of robot-based assembly systems in the construction industry, and it paves the way for further integration of large language model technologies in the field of construction robotics.
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- Asia > Vietnam > Hanoi > Hanoi (0.05)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
Optimizing robotic swarm based construction tasks
Liyanage, Teshan, Fernando, Subha
Social insects in nature such as ants, termites and bees construct their colonies collaboratively in a very efficient process. In these swarms, each insect contributes to the construction task individually, showing redundant and parallel behavior of individual entities. However, robotic adaptations of these swarm behaviors have not yet reached the real world at a scale large enough for common use, owing to limitations in the existing approaches to swarm robotic construction. This paper presents an approach that combines the existing swarm construction approaches, resulting in a swarm robotic system capable of constructing a given two-dimensional shape in an optimized manner.
- Asia > Sri Lanka > Western Province > Colombo > Colombo (0.05)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- Africa (0.04)